Abstract

Recent technology reviews have identified the need for 
objective assessments of engine health management (EHM) 
technology. The need is two-fold: technology developers 
require relevant data and problems to design and validate new 
algorithms and techniques while engine system integrators and 
operators need practical tools to direct development and then 
evaluate the effectiveness of proposed solutions. This paper 
presents a publicly available gas path diagnostic benchmark 
problem that has been developed by the Propulsion and Power 
Systems Panel of The Technical Cooperation Program (TTCP) 
to help address these needs. The problem is coded in
MATLAB (The MathWorks, Inc.) and coupled with a non-linear
turbofan engine simulation to produce “snap-shot”
measurements, with relevant noise levels, as if collected from
a fleet of engines over their lifetime of use. Each engine
within the fleet will experience unique operating and
deterioration profiles, and may encounter randomly occurring
relevant gas path faults including sensor, actuator and component
faults. The challenge to the EHM community is to develop gas
path diagnostic algorithms to reliably perform fault detection 
and isolation. An example solution to the benchmark problem 
is provided along with associated evaluation metrics. A plan is 
presented to disseminate this benchmark problem to the 
engine health management technical community and invite 
technology solutions. 

Introduction 

A recent technology review has revealed that while Engine
Health Management (EHM) related research and development
has increased significantly in recent years, there exists a
fundamental inconsistency in defining and representing EHM
problems (ref. 1). Currently many of the EHM solutions
published in the open literature are applied to different
platforms, with different levels of complexity, addressing
different problems, and using different metrics for evaluating
performance. As such, it is difficult to perform a one-to-one
comparison of candidate approaches. Furthermore, these
inconsistencies create barriers to effective development of new
algorithms and the exchange of EHM-related ideas and results.

*Currently employed by the NASA Glenn Research Center.

Past efforts have made progress towards addressing these 
issues. Several authors have presented results from their self- 
conducted comparative assessments of gas path diagnostics 
methods (refs. 2 to 4). On a broader scale, the On Board 
Identification, Diagnosis and Control of Gas Turbine Engines 
(OBIDICOTE) Project conducted by the European research 
community defined a common set of gas turbine engine fault 
cases which were used by several researchers for diagnostic 
method development and evaluation (refs. 5 to 7). To help 
further address these issues, and to facilitate international 
cooperation, an Engine Health Management Industry Review 
(EHMIR) effort has been initiated under the auspices of The 
Technical Cooperation Program (TTCP), Aerospace Systems 
Group, Propulsion and Power Systems Panel. TTCP is a forum 
for defense science and technology collaboration between 
Australia, Canada, New Zealand, the United Kingdom and the 
United States (ref. 8). The EHMIR will provide reference, or
theme, problems to aid in technology development and
evaluation. Gas path and vibration sub-teams are currently
developing theme problems that are both relevant and
challenging. The objective of this effort is to construct and
disseminate EHM theme problems, and invite solutions from
the EHM community. Following a period of individual
development, it is proposed that a conference be held to present
the results with blind test case evaluations. The overall goal is to
provide a set of environments that will truly facilitate the
development and evaluation of significant EHM capabilities.

This paper specifically covers the progress on defining the 
gas path diagnostic theme problem and the associated metrics 
for benchmarking the performance of diagnostic solutions. It 
is organized as follows. First, the proposed public approach
towards benchmarking gas path diagnostic methods is
introduced. This consists of: 1) an Engine Fleet Simulator (EFS) to
generate simulated test cases with implanted faults and
degradation; 2) a description of the interface between EFS outputs
and user developed diagnostic solutions; and 3) a description
of proposed metrics for evaluating the performance of
candidate diagnostic solutions. Next, an example diagnostic
solution is given, and its performance against the defined metrics
is presented. Finally, the follow-on steps for disseminating the 
problem to, and inviting technology solutions from, the EHM 
community are discussed. Technology developers, original 
equipment manufacturers (OEM) and operators are invited to 
provide their comments and feedback on any aspect of the 
proposed approach. 


Nomenclature

C-MAPSS          Commercial Modular Aero-Propulsion System Simulation
EFS              Engine Fleet Simulator
EGT              Exhaust Gas Temperature
EHM              Engine Health Management
EHMIR            Engine Health Management Industry Review
FPR              False Positive Rate
GUI              Graphical User Interface
HPC              High Pressure Compressor
HPT              High Pressure Turbine
LPC              Low Pressure Compressor
LPT              Low Pressure Turbine
OBIDICOTE        On Board Identification, Diagnosis and Control of Gas Turbine Engines
OEM              original equipment manufacturer
PCC              Percent Correctly Classified
PLA              power lever angle
TPR              true positive rate
TTCP             The Technical Cooperation Program
C                confusion matrix
H                fault influence coefficient matrix
Nc               core speed
Nf               fan speed
NfR              corrected fan speed
P0               state covariance matrix
P15              total pressure in bypass-duct
P2               total pressure at fan inlet
P24              total pressure at LPC outlet
Pamb             ambient pressure
PCNfR            percent corrected fan speed
Ps30             static pressure at HPC outlet
R                measurement noise covariance
T2               total temperature at fan inlet
T24              total temperature at LPC outlet
T30              total temperature at HPC outlet
T48              total interstage turbine temperature
T48R             corrected interstage turbine temperature
VBV              Variable Bleed Valve
VSV              Variable Stator Vane
Wf               fuel flow
ema              exponential moving average
e_j              normalized measurement estimation error of hypothesized fault case j
m                number of sensor measurements
n                number of single fault types
psia             pounds per square inch absolute
rpm              revolutions per minute
y_i(k)           measurement i collected on flight k
y_i,baseline(k)  fleet average engine value of measurement i at flight k operating conditions
α                moving average smoothing constant
β                delay in measurement delta-delta calculation
γ                component flow capacity adder
ΔTamb            difference between ambient temperature and standard atmospheric temperature
Δx(k)            fault magnitude vector
Δx̂(k)            estimated fault magnitude vector
Δy_i(k)          measurement delta: difference between actual and fleet average engine in measurement i at flight k
Δy_i,ema(k)      exponential moving average of measurement delta i on flight k
ΔΔy_i,ema(k)     measurement delta-delta: difference in ema of measurement i at flight k relative to some previous flight
ΔΔŷ_i,ema(k)     estimated measurement delta-delta
ΔΔY(k)           vector of measurement delta-deltas
η                component efficiency adder
κ                kappa coefficient
σ_i              standard deviation of measurement i

Subscripts

i                measurement index
j                fault index
k                flight index
p                confusion matrix row index
q                confusion matrix column index

Superscript

T                transpose operator

Gas Path Diagnostic Benchmark Process 

The goal of aircraft engine gas path diagnostics is to
reliably assess and manage the health of gas turbine engine
flow-path components. It is performed by relating observed changes
in measured engine gas path variables (typically the suite of
control feedback sensors) to changes in engine module
performance, engine system malfunctions, or instrumentation
problems. An inherent requirement is knowledge of
gas-turbine parameter interrelationships. Gas path diagnostic
functionality can reside both on-board and off-board the
aircraft. Typical measurements include pressures,
temperatures, rotor speeds, and fuel flow. The number of sensors
available varies depending on engine type and model, but
applications of four to eight sensors are common.

Ground-based gas-path health management systems rely on 
data acquired on-board the aircraft within the engine control 
computer. Although an engine controller may sample sensor 
data at 20 to 50 Hz, typically only a fraction of the data is 
transferred to ground-based health management systems for 
analysis. While new aircraft data acquisition systems are 
improving the quantity and quality of archived data relative to 
legacy aircraft, the infrastructure required to effectively 
archive and manage all sensed engine data generated from the 
flight of every engine is often not practical. A typical
application will instead record “snap-shot” engine gas-path
measurements collected at two to three operating points during each
flight. For a commercial application, which follows a
relatively consistent flight profile, this may include recordings at
ground-idle, max EGT, and cruise conditions. Military
applications, which exhibit much greater variability in mission
profiles, may be limited to recordings at ground idle and 
takeoff. When possible, the data acquisition system may wait 
until the engine achieves quasi-steady-state operation, and 
then archive the average sensor measurements recorded over a 
window of time. 


The proposed process for benchmarking aircraft engine gas 
path diagnostic methodologies is presented in figure 1. It 
specifically focuses on diagnostic methods applied to “snap 
shot,” or discrete, engine measurements. The intent is to 
provide users a publicly available toolset to enable the
development, evaluation, and side-by-side comparison of candidate
diagnostic solutions. This includes providing the functionality
to evaluate analytical, empirical or hybrid (analytical +
empirical) diagnostic approaches. The elements of the benchmark
process include an EFS that generates sensed parameter 
histories as if collected from a fleet of engines over their 
lifetime of use; user developed diagnostic solution(s) that 
interpret EFS generated parameter histories to diagnose any 
faults; and a routine to automatically evaluate the performance 
of candidate diagnostic solutions against a predefined set of 
metrics. Each element is discussed in more detail in the 
sections that follow. 

Engine Fleet Simulator 

Gas path diagnostic algorithm development and validation 
requires access to engine models and data. Ideally, this would 
include a rich database of information collected from engines 
over a broad range of operating conditions, deterioration 
levels, and known fault and no-fault conditions. However, to 
facilitate a public benchmarking approach, it was decided to 
generate simulated engine data utilizing a publicly available 
turbofan engine simulation. This approach avoids the use of 
engine data and analytical models that contain proprietary 
information. While an engine simulation will never fully 
capture all the nuances contained in actual engine data, it does 
provide some advantages. For example, it will allow the 
simulation of a broader range of fault types and magnitudes 
occurring over a broader range of engine operating conditions. 
It will also provide unambiguous knowledge of an engine’s 
true fault/no-fault state, or “ground truth” condition. 

The benchmark problem has been constructed as an EFS to
generate histories of “snap shot” engine parameters collected
from each engine on each flight. The EFS architecture, shown
in figure 2, is implemented in the MATLAB environment.
This architecture consists of a graphical user interface (GUI)
that accepts user specified inputs regarding the number of
engines and the number of occurrences of each fault type, a
case generator designed to produce unique fault effects and
operating profiles for each engine in the fleet, and a non-linear
steady-state turbofan engine simulation which produces
sensed parameter histories for each engine in the fleet. Each
component of the EFS is further described below.

Figure 1. Gas path diagnostic benchmark process.

Figure 2. EFS architecture. (Blocks: Graphical User Interface (GUI), Case Generator, C-MAPSS Steady-State Engine Model, Parameter Histories.)

EFS Graphical User Interface 

The EFS GUI, shown in figure 3, is designed to provide 
flexibility in generating data sets for diagnostic development 
and validation purposes. Through this interface, the type and 
number of faults that occur within the fleet of engines are 
defined. It should be emphasized that the EFS has been
designed assuming that an individual engine can only
experience a single fault; it will not simulate multiple faults
occurring within the same engine. There are 18 possible fault
scenarios plus the no-fault scenario. The sum of the number of 
occurrences of each scenario determines the total number of 
engines in the fleet. The interface also allows the user to 
specify the following: the number of flights over which output 
data will be collected for each engine; the flight of fault 
initiation (either at a fixed flight number or randomly within a 
specified window of flights); the rate at which faults evolve, 
either abruptly (instantaneously) or rapidly (over a number of 
flight cycles); and sensor noise turned on or off. 


EFS Case Generator 

After the EFS inputs have been specified through the GUI, 
the user selects the “Run EFS” button which initiates the 
process of generating engine parameter histories according to 
the number and type of scenarios specified by the user. The 
Case Generator randomly assigns a unique operating history 
and component deterioration profile to each engine in the 
fleet. This includes assigning the city pairs that an engine will 
fly between, and the calendar date on which engine data 
collection will commence. Altitude, Mach number, ambient 
temperature and power setting parameters at the takeoff and 
cruise operating points where data will be collected during 
each flight are all randomly generated from specified 
distributions. At takeoff, the power reference parameter is 
established by the Power Lever Angle (PLA) setting which 
will either be 100 percent, or a fixed de-rated takeoff power 
setting of 90 or 80 percent. At cruise, the power reference 
parameter is specified by net thrust. Histograms showing the
distributions in operating parameters produced by the EFS 
Case Generator over 100,000 flights are shown in figure 4 
(takeoff) and figure 5 (cruise). 

Deterioration effects are simulated via adjustments to 10
health parameters within the engine simulation so that engines
will continuously degrade over time. These health parameters
include an efficiency scalar and a flow adder for each of the
five major modules in the engine (Fan, Low Pressure
Compressor (LPC), High Pressure Compressor (HPC), High
Pressure Turbine (HPT), and Low Pressure Turbine (LPT)).
Published information on turbofan performance deterioration
profiles based upon historical data can be found in
references 9 to 11. The fleet average deterioration profile
implemented within the EFS is based on information provided
in reference 10, although adjustments have been made to the
turbine health parameters to cause them to deteriorate on a
time scale consistent with the rest of the engine. Variations to
the fleet average deterioration profile are included, via
randomly assigned scale factors and adders, to produce a unique
deterioration profile for each individual engine including:
1) more/less rapid overall engine deterioration; 2) more/less
rapid individual module deterioration; 3) more/less module
flow deterioration relative to efficiency deterioration; and
4) initial engine-to-engine manufacturing variation. Figure 6
shows the baseline, or average, profile for each health
parameter (in red) and example variations (in cyan). The cyan points
were generated by running the EFS to collect data from 100
engines over 5000 flights. 5000 flights is the maximum
number of flight cycles that can be defined for any engine in
the EFS. However, the user can choose fewer flights. If so, an
engine’s starting condition will be randomly placed
somewhere along the deterioration profiles defined in figure 6.

Figure 3. EFS graphical user interface.

Figure 4. Takeoff operating condition distributions.

Figure 5. Cruise operating condition distributions. (Panels include cruise Tamb and cruise net thrust, Fn (lbs).)

In addition to generating the operating history and
deterioration profile for each engine, the Case Generator will also
define the type of fault each engine will experience, the flight
of fault initiation, the fault magnitude, and the fault evolution
rate. A summary of the fault types and their uniformly
distributed fault magnitudes is provided in table 1. Component faults
(i.e., Fan, LPC, HPC, HPT and LPT) are simulated by
simultaneously adjusting the efficiency, η, and flow capacity, γ,
health parameters of the component. Component fault
magnitude distributions shown in the table are in terms of the
root-sum-square value of the combined η and γ deviations. The
uniformly distributed ratios of flow capacity to efficiency
health parameter adjustment, γ:η ratio, are also shown. For Fan,
LPC, and HPC faults, both γ and η are reduced. For HPT and
LPT faults, γ is reduced while η is increased. Component
faults are simulated by adjusting efficiency and flow capacity
as follows:

$$ \eta_{\text{adjustment}} = \frac{\text{fault magnitude}}{\sqrt{1 + (\gamma{:}\eta_{\text{ratio}})^2}} $$

$$ \gamma_{\text{adjustment}} = \eta_{\text{adjustment}} \cdot (\gamma{:}\eta_{\text{ratio}}) \qquad (1) $$

Currently, Variable Bleed Valve (VBV) and Variable Stator
Vane (VSV) actuator faults are modeled via adjustments to
LPC and HPC flow capacity respectively. In the future, more
detailed actuator models will be included to allow
off-schedule VBV and VSV faults to be simulated. Sensor fault
magnitudes are reflected in units of measurement standard
deviation, σ, which varies in proportion to the sensed
parameter magnitude. Average σ values at takeoff and cruise are
shown in the table.

Figure 6. Health parameter deterioration profiles, average (red) and distribution (cyan).

TABLE 1. EFS FAULT TYPES

Fault  Fault               Fault        Fault        Takeoff      Cruise
ID     description         magnitude    γ:η ratio    average σ    average σ
0      No-fault            -            -            -            -
1      Fan fault           1 to 7%      1 to 2       -            -
2      LPC fault           1 to 7%      1 to 2       -            -
3      HPC fault           1 to 7%      1 to 2       -            -
4      HPT fault           1 to 7%      -0.5 to -1   -            -
5      LPT fault           1 to 7%      -0.5 to -1   -            -
6      VSV fault           1 to 7%      -            -            -
7      VBV fault           1 to 7%      -            -            -
8      Nf sensor fault     ±1 to 7 σ    -            5.59 rpm     4.64 rpm
9      Nc sensor fault     ±1 to 7 σ    -            15.10 rpm    13.23 rpm
10     P15 sensor fault    ±1 to 7 σ    -            0.042 psia   0.013 psia
11     P24 sensor fault    ±1 to 7 σ    -            0.054 psia   0.016 psia
12     Ps30 sensor fault   ±1 to 7 σ    -            0.889 psia   0.220 psia
13     T24 sensor fault    ±1 to 7 σ    -            1.01 °R      0.82 °R
14     T30 sensor fault    ±1 to 7 σ    -            2.47 °R      1.92 °R
15     T48 sensor fault    ±1 to 7 σ    -            10.20 °R     7.55 °R
16     Wf sensor fault     ±1 to 7 σ    -            0.058 pps    0.011 pps
17     P2 sensor fault     ±1 to 7 σ    -            0.023 psia   0.008 psia
18     T2 sensor fault     ±1 to 7 σ    -            0.838 °R     0.707 °R
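The component fault adjustment of equation (1) can be illustrated with a short numerical sketch. It is written in Python purely for illustration (the EFS itself is MATLAB), and the function name `component_fault_adjustments` is hypothetical. The sketch applies the two formulas as written and returns signed values; which parameter is treated as a reduction follows the convention described in the text and is not modeled here.

```python
import math


def component_fault_adjustments(fault_magnitude, gamma_eta_ratio):
    """Split a root-sum-square fault magnitude into eta and gamma adjustments.

    fault_magnitude: root-sum-square of the combined eta and gamma deviations (%).
    gamma_eta_ratio: ratio of flow capacity to efficiency adjustment (gamma:eta).
    """
    eta_adjustment = fault_magnitude / math.sqrt(1.0 + gamma_eta_ratio ** 2)
    gamma_adjustment = eta_adjustment * gamma_eta_ratio
    return eta_adjustment, gamma_adjustment


# Illustrative values only: a 5% fault with a gamma:eta ratio of -0.75,
# which lies inside the turbine ranges listed in table 1.
eta_adj, gamma_adj = component_fault_adjustments(5.0, -0.75)

# The two adjustments recombine to the requested root-sum-square magnitude.
recombined = math.hypot(eta_adj, gamma_adj)
```

With these inputs the efficiency and flow capacity adjustments have opposite signs, as expected for a negative ratio, and their root-sum-square equals the specified fault magnitude.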

C-MAPSS Engine Model 

The outputs of the Case Generator are provided as inputs to 
the NASA Commercial Modular Aero-Propulsion System 
Simulation (C-MAPSS) high-bypass turbofan engine model. 
C-MAPSS is a transient non-linear aerothermodynamic engine
model equipped with an associated closed-loop control
system. It has been developed for controls and diagnostics
research and development purposes (ref. 12). A modified
steady-state version of C-MAPSS is implemented within the
EFS to facilitate faster convergence. For each engine,
at each flight operating point, the C-MAPSS engine model is
run to the steady-state operating conditions specified by the
Case Generator. Logic is applied to ensure that control limits
are not violated in generating the steady-state solutions. End
users will be provided access to the C-MAPSS steady-state
model for diagnostic solution development purposes. For
example, fault influence coefficient matrices can be extracted
from C-MAPSS for use in the design of model-based
diagnostic methods.

Parameter Histories and Fault Information 

After the simulated engine parameter histories are produced
by C-MAPSS, they are archived to a MATLAB file. This
includes three-dimensional matrices containing sensed
information collected at takeoff and cruise of dimension Number of
Engines, by Number of Flights, by 12 Sensed Outputs. The 12
sensed outputs are shown in table 2. In addition to parameter
histories, the EFS also stores the associated fault information
for each engine in the fleet. This includes the fault type, fault
magnitude, flight of fault initiation and fault evolution rate.
This “ground truth” fault information will allow users to
evaluate the overall performance of their developed diagnostic
solutions. Once an EFS output dataset has been generated, it
can be stored and used for future diagnostic algorithm
development and evaluation purposes.
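The layout of the archived parameter histories can be pictured with a small sketch. This is an illustration in Python using nested lists, not the MATLAB file format produced by the EFS; the names `SENSORS`, `empty_history` and `sensor_trace` are hypothetical, and the stored value is a placeholder rather than simulation output.

```python
# Sensor order assumed to follow the index column of table 2.
SENSORS = ["Nf", "Nc", "P15", "P24", "T24", "Ps30",
           "T30", "T48", "Wf", "P2", "T2", "Pamb"]


def empty_history(num_engines, num_flights, num_sensors=len(SENSORS)):
    """Allocate an engines x flights x sensors history filled with zeros."""
    return [[[0.0] * num_sensors for _ in range(num_flights)]
            for _ in range(num_engines)]


def sensor_trace(history, engine, sensor):
    """Return one sensor's values over all flights of one engine."""
    s = SENSORS.index(sensor)
    return [flight[s] for flight in history[engine]]


# One history per operating point, as described in the text.
takeoff = empty_history(num_engines=3, num_flights=5)
cruise = empty_history(num_engines=3, num_flights=5)

# Placeholder value for engine 1, flight 2 (zero-based), sensor T48.
takeoff[1][2][SENSORS.index("T48")] = 1500.0
trace = sensor_trace(takeoff, engine=1, sensor="T48")
```

Indexing by engine first and flight second mirrors the stated dimension order, so the full flight history of any one sensor on any one engine is a single slice.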


TABLE 2. EFS SENSOR OUTPUT PARAMETERS

Index  Symbol  Description                       Units
1      Nf      physical fan speed                rpm
2      Nc      physical core speed               rpm
3      P15     total pressure in bypass-duct     psia
4      P24     total pressure at LPC outlet      psia
5      T24     total temperature at LPC outlet   °R
6      Ps30    static pressure at HPC outlet     psia
7      T30     total temperature at HPC outlet   °R
8      T48     total temperature at HPT outlet   °R
9      Wf      fuel flow                         pps
10     P2      total pressure at fan inlet       psia
11     T2      total temperature at fan inlet    °R
12     Pamb    ambient pressure                  psia


User Developed Diagnostic Solutions 

As shown in figure 1, outputs from the EFS are provided as 
test case datasets for the development and validation of user 
developed diagnostic solutions. The challenge is to develop
robust diagnostic methods to correctly diagnose the
occurrence of any faults contained within the provided datasets,
with minimal false alarms, missed detections,
mis-classifications, and detection latency. Diagnostic solutions
shall produce an assessment for each engine, each flight, based
solely on the sensed parameters shown in table 2 collected at,
and prior to, the current flight. Although sensed engine outputs
from upcoming flights and ground-truth fault information will
also be available, it is not permissible to utilize this
information for conducting diagnosis. As an output, each diagnostic
algorithm shall produce a diagnostic assessment for each 
engine. These assessments are to be stored in a specified 
MATLAB format to facilitate automated evaluation against 
defined evaluation metrics. Diagnostic assessments will be 
stored in the form of a two-dimensional matrix of dimension 
number of engines by two, where each row corresponds to an 
individual engine, and the two columns consist of the Fault ID 
(as defined in table 1), and the flight of fault detection. For 
instances of no-fault found, zeros are to be recorded in each 
column. 
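The required output format described above can be sketched as follows. The example is in Python for illustration only (submitted assessments are to be stored in the specified MATLAB format), and the helper name `build_assessments` and the sample detections are hypothetical.

```python
def build_assessments(num_engines, detections):
    """Build the engines-by-two diagnostic assessment matrix.

    detections maps a zero-based engine index to a tuple of
    (fault_id, flight_of_detection). Engines that are not listed
    are reported as no-fault found, i.e. zeros in both columns.
    """
    assessments = [[0, 0] for _ in range(num_engines)]
    for engine, (fault_id, flight) in detections.items():
        if not 1 <= fault_id <= 18:
            raise ValueError("fault ID must be one of the 18 types in table 1")
        if flight < 1:
            raise ValueError("flight of detection must be a positive flight number")
        assessments[engine] = [fault_id, flight]
    return assessments


# Hypothetical results for a four-engine fleet: an HPC fault (ID 3)
# detected on flight 120 and a T48 sensor fault (ID 15) on flight 47.
assessments = build_assessments(4, {1: (3, 120), 3: (15, 47)})
```

Each row corresponds to one engine, so the matrix can be compared row by row against the ground-truth fault information stored by the EFS.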

Diagnostic Metrics 

In order to enable a side-by-side comparison of candidate 
diagnostic approaches it is not only necessary to apply the 
candidate approaches to a common diagnostic problem, but 
also to evaluate their respective performance against a
standard set of evaluation metrics. The importance of quantitative
measures of effectiveness for diagnostic algorithms has been
identified for gas turbine engine health management systems
(refs. 13 to 16). These measures include performance metrics
(e.g., thresholds, accuracy, reliability, sensitivity) as well as
effectiveness metrics (e.g., complexity and cost). Metrics
applications have also been developed to guide the selection of
adequate data sets so that reliable statistics can be calculated
(ref. 17). A complete coverage of key metrics is provided in
the Society of Automotive Engineers Aerospace
Recommended Practice Document 5783 (ref. 18).

For the TTCP gas path diagnostic benchmark process
presented herein, a uniform set of metrics will be defined to
evaluate the performance of diagnostic solutions developed
and applied. Solution developers will be provided these
metrics to enable them to independently evaluate the
performance of their individual solutions given the ground-truth fault
output information produced by the EFS. Although the
complete list of metrics to be applied within the benchmark
process is still being defined, the metrics will be established to
evaluate diagnostic accuracy (missed detections, false alarms, 
mis-classifications, and correct classifications), as well as 
detection latency (time required to detect a fault after fault 
initiation). These will include measures of overall diagnostic 
performance, provided by detection decision matrices and 
classification confusion matrices, as well as metrics which 
summarize overall diagnostic performance to enable a direct 
comparison between candidate algorithms, such as the Kappa 
Coefficient (ref. 18) or Normalized Product Entropy Ratio 
(ref. 16). The detection decision matrix, as shown in figure 7,
demonstrates an algorithm’s ability to discriminate between
fault and no-fault cases. Here, the diagonal reflects the number
of correct predictions.

                 Predicted State
True State       Fault                            No Fault
Fault            True Positives                   False Negatives (missed detections)
No Fault         False Positives (false alarms)   True Negatives

Figure 7. Detection decision matrix.

Figure 8. Classification confusion matrix. (Columns: predicted state, Fault 1 through Fault n.)

From the detection decision matrix, True Positive Rate (TPR) and
False Positive Rate (FPR) detection metrics can be readily calculated:

$$ \text{TPR} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \cdot 100\% $$

$$ \text{FPR} = \frac{\text{False Positives}}{\text{False Positives} + \text{True Negatives}} \cdot 100\% \qquad (2) $$
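As a minimal sketch of equation (2), the detection counts can be tallied from per-engine ground-truth and predicted Fault IDs. The Python below is illustrative and is not the evaluation routine distributed with the EFS; the function names and sample data are hypothetical. It assumes that any non-zero predicted Fault ID on a faulty engine counts as a detection regardless of classification, and it ignores detection timing.

```python
def detection_counts(true_ids, predicted_ids):
    """Tally the four cells of the detection decision matrix.

    Fault ID 0 means no-fault; any non-zero ID is treated as "fault".
    Returns (true positives, false negatives, false positives, true negatives).
    """
    tp = fn = fp = tn = 0
    for true_id, predicted_id in zip(true_ids, predicted_ids):
        if true_id != 0 and predicted_id != 0:
            tp += 1
        elif true_id != 0:
            fn += 1
        elif predicted_id != 0:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def detection_rates(tp, fn, fp, tn):
    """Return (TPR, FPR) in percent, following equation (2)."""
    tpr = 100.0 * tp / (tp + fn)
    fpr = 100.0 * fp / (fp + tn)
    return tpr, fpr


# Hypothetical eight-engine fleet: four no-fault engines, four faulty ones.
true_ids = [0, 0, 0, 0, 3, 3, 8, 15]
predicted_ids = [0, 0, 0, 5, 3, 0, 8, 4]

counts = detection_counts(true_ids, predicted_ids)
tpr, fpr = detection_rates(*counts)
```

In this example the last engine is counted as detected even though it is mis-classified, which is why classification performance needs a separate measure.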


Although the detection decision matrix provides
information on an algorithm’s fault-detection capability, additional
information is required to evaluate an algorithm’s ability to 
discriminate, or classify, between multiple fault types. This 
will be reflected in a confusion matrix as shown in figure 8. 
The confusion matrix, denoted here as C, is a square matrix of 
dimension n, where n is the number of fault types. The 
“no-fault case” can also be included in the confusion matrix, 
although it has been excluded in the example shown here. 

The diagonals of this square matrix reflect correct
classifications. The Percent Correctly Classified (PCC) for the p-th
fault type, PCC_p, can be calculated by dividing the number of
correct classifications of fault p by the total number of fault p
observations:

$$ \text{PCC}_p = \frac{C_{pp}}{\sum_{q=1}^{n} C_{pq}} \cdot 100\% \qquad (3) $$
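Equation (3) can be illustrated with a small worked example. The Python sketch below assumes, consistent with the row and column indices p and q, that rows of the confusion matrix hold the true fault type and columns the predicted fault type; the function name and the matrix values are hypothetical.

```python
def percent_correctly_classified(confusion):
    """Return PCC_p (in percent) for every fault type p.

    confusion[p][q] is the number of fault-p cases classified as fault q.
    A row with no observations yields None, since PCC is undefined there.
    """
    pcc = []
    for p, row in enumerate(confusion):
        observations = sum(row)
        if observations == 0:
            pcc.append(None)
        else:
            pcc.append(100.0 * row[p] / observations)
    return pcc


# Hypothetical three-fault confusion matrix, ten observations per fault type.
C = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 0, 10],
]
pcc = percent_correctly_classified(C)
```

Here the second fault type is the hardest to classify: four of its ten occurrences are assigned to other fault types.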

While decision and confusion matrices provide an overall 
assessment of diagnostic algorithm performance, they are not 
in a form that enables easy comparison of different algorithms, 
especially when dealing with multiple fault classes. An addi- 
tional metric, which summarizes the content of a confusion 
matrix into a single parameter, is the Kappa Coefficient 
(ref. 18). It is a measure of an algorithm’s ability to correctly 
classify a fault, which takes into account the expected number 
of correct classifications occurring by chance. The Kappa 
Coefficient, denoted here as k, can be calculated from the 
elements of a confusion matrix as follows: 

_ ^(correctly classifed)- ^(expected correct by chance) 
7V(total)- ^(expected correct by chance) 

where 

n 

^(correctly classified) = lO, (4) 

p = i 


matically evaluates the performance of a diagnostic solution 
against the defined metrics, will be distributed along with the 
EFS. 

Gas Path Diagnostic Benchmark 
Process — Example Diagnostic Solution 

The previous sections outlined a process for developing and 
benchmarking candidate gas path diagnostic methodologies. In 
this section, an example diagnostic solution is presented and 
evaluated against the initial list of metrics described above. It 
is presented to demonstrate how a user defined diagnostic 
solution would integrate into the process. This example should 
not be interpreted as the recommended approach for solving 
the problem. It is simply provided as an example solution to 
illustrate the overall diagnostic process, and to serve as a 
template for the development of additional diagnostic solu- 
tions. The overall example solution, shown in figure 9, is 
partitioned into a three-step process consisting of: 1) trend 
monitoring; 2) anomaly detection; and 3) event isolation. Each 
of these steps is further discussed below. 

Example Solution: Step 1 — Trend Monitoring 


ivftotal) = 2 j 2j Cpq 

P = 1 9=1 

^(expected correct by chance) = III 


C 


pq 


^[^Mtotal) £ 


Yc 1 

Z-i 9P I 


If a diagnostic method achieves perfect fault classification 
performance then k = 1. If its classification performance is 
worse than that expected by chance then k < 0. The Kappa 
Coefficient along with detection and confusion matrices form 
an initial set of metrics to be included in the gas path diagnos- 
tic benchmark process. As additional metrics are defined, they 


The gas path diagnostic benchmark problem presented in this 
paper has been constructed assuming that engine performance 
changes can manifest themselves in two ways: a) gradual 
(long-term) deterioration or b) rapid (short-term) deterioration. 
The former is due to all of the engine components deteriorat- 
ing slowly over time and is included here in an attempt to 
emulate physical causes such as erosion, corrosion, fouling 
and increased clearances within the turbomachinery. The latter 
is due to a single fault occurring. An effective gas path diag- 
nostic solution must be able to function with both of these 
processes occurring and interacting simultaneously, and must 
be able to discriminate between the two cases, without cor- 
rupting the overall diagnostic approach. 


will be included as well. A MATLAB routine, which auto- 


Figure 9. — Example solution process. Inputs shown: engine sensed
parameters, y, and engine operating conditions.
Figure 10. — Corrected T48 measurement deltas and
corresponding exponential moving averages.


In the example solution presented herein, a trend monitoring
approach is applied to capture gradual performance
changes in the form of residuals, or measurement deltas,
relative to a fleet average engine. Although the defined metrics
will not require the inclusion of trend monitoring functionality
within a diagnostic solution, it is expected to improve
overall diagnostic performance, and is thus included as part of
the example solution. The C-MAPSS engine model was run
over a range of altitude, Mach number, and corrected fan
speed (NfR) settings at the 50 percent deterioration level in
order to define a three-dimensional table lookup model used to
represent a fleet average engine. Measured engine data collected
during each flight are corrected to standard day conditions
and then referenced against the fleet average engine to
calculate measurement deltas, $\Delta y_i$'s, as shown in equation (5)

$$\Delta y_i(k) = y_i(k) - y_{i\_baseline}(k) \tag{5}$$

where $y_i(k)$ is the corrected value of the $i^{\text{th}}$ measurement
collected during the $k^{\text{th}}$ flight, and $y_{i\_baseline}(k)$ is the fleet
average engine value for the $i^{\text{th}}$ measurement at the
corresponding pressure altitude, Mach number and NfR values
of the $k^{\text{th}}$ flight. (Note: pressure altitude is the altitude
corresponding to Pamb as defined within Standard
Atmosphere tables. Pressure altitude and Mach number can be
calculated from Pamb and P2.) A $\Delta y_i(k)$ value is only calculated
for eight of the 12 measurements shown in table 2. The
parameters Nf, Pamb, P2, and T2 are used for establishing the
engine operating point and/or parameter correction purposes
and are thus excluded from the $\Delta y_i(k)$ calculations. The
calculated measurement delta values are trended over time
applying an exponential moving average approach as
described in reference 19 and shown in equation (6)

$$\Delta y_{i\_ema}(k) = \alpha \cdot \Delta y_{i\_ema}(k-1) + (1-\alpha) \cdot \Delta y_i(k) \tag{6}$$

where $\Delta y_{i\_ema}(k)$ is the exponential moving average of the $i^{\text{th}}$
measurement delta on flight $k$. The moving average weighting
between previous and current data is established by the
constant $\alpha$ (where $0 < \alpha < 1$). An example of corrected T48
measurement deltas, $\Delta$T48, and corresponding exponential
moving average values (using an $\alpha$ of 0.885) from two
randomly generated engines operating over 200 flight cycles is
shown in figure 10. The exponential moving average can be
seen to produce a smoothing effect on the overall trend shifts.
Although it is not readily apparent from the plots, Engine 2
experiences a 3 percent fan fault on flight cycle 100 while
Engine 1 experiences no faults. While a fan fault will cause
$\Delta$T48 to decrease, the downward trend observed in Engine 2
prior to fault initiation (between flights 90 and 100) is entirely
coincidental and unrelated to the fault.
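The trend monitoring step can be sketched in a few lines of Python. This is only an illustrative sketch of equations (5) and (6), not the benchmark's MATLAB code: the function names, the sample T48 values, and the choice to seed the moving average with the first measurement delta are all assumptions made here.

```python
from typing import List, Sequence


def measurement_deltas(y: Sequence[float], y_baseline: Sequence[float]) -> List[float]:
    """Equation (5): corrected measurement minus fleet-average baseline, per flight."""
    if len(y) != len(y_baseline):
        raise ValueError("y and y_baseline must have the same length")
    return [yk - bk for yk, bk in zip(y, y_baseline)]


def exponential_moving_average(deltas: Sequence[float], alpha: float = 0.885) -> List[float]:
    """Equation (6): ema(k) = alpha * ema(k-1) + (1 - alpha) * delta(k)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    ema: List[float] = []
    for d in deltas:
        if not ema:
            ema.append(d)  # assumption: seed the average with the first delta
        else:
            ema.append(alpha * ema[-1] + (1.0 - alpha) * d)
    return ema


# Hypothetical corrected T48 readings and baseline values for five flights.
t48 = [1500.0, 1502.0, 1499.0, 1501.0, 1490.0]
t48_baseline = [1500.0, 1500.0, 1500.0, 1500.0, 1500.0]
deltas = measurement_deltas(t48, t48_baseline)
trend = exponential_moving_average(deltas)
```

With $\alpha$ close to 1, most of the weight stays on the previous average, so a single outlying flight moves the trend only slightly.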

Example Solution: Step 2 — Anomaly Detection 

The previous sub-section presented an example approach
for monitoring gradual engine performance measurement
deltas over time. Next, anomaly detection logic is applied to
detect discrete events causing a rapid shift in observed
measurement deltas. It is important to recognize that any anomaly
detection logic must function relying only on the measurement
information collected at, and prior to, the current flight cycle,
$k$. It is also prudent to incorporate a time latency in the
detection process to avoid false alarms caused by statistical
outliers in the measurement data. The approach applied herein
for anomaly detection uses a backwards difference calculation
of the exponential moving average of each measurement delta as
shown in equation (7)

$$\Delta\Delta y_{i\_ema}(k) = \Delta y_{i\_ema}(k) - \Delta y_{i\_ema}(k - \beta) \tag{7}$$

where $\Delta\Delta y_{i\_ema}(k)$, or the measurement delta-delta, is the
change in the exponential moving average of the $i^{\text{th}}$
measurement delta between flight $k$ and some previous flight,
$k - \beta$. The measurement delta-deltas also provide direct utility in
performing the fault isolation function as will be described
later. Choosing a $\beta = 10$ flight cycle distance between the
compared EMA values was found to provide detection
capability for both abrupt and rapid faults. It is
anticipated that improved detection robustness for abrupt as
well as rapid faults could be obtained by combining multiple
detection filters which apply different $\beta$ distances within the
measurement delta-delta calculations. Figures 11 and 12 show
the measurement deltas (top plot), exponential moving
average values (middle plot), and measurement delta-delta
values (bottom plot) for eight engine parameters (Nc, P15,
P24, Ps30, T24, T30, T48, and Wf) collected from the two
engines previously introduced in figure 10. In these figures
each signal has been normalized by the standard deviation, $\sigma_i$,
of the measurement delta, $\Delta y_i$, to enable a comparison between
the measurements. Anomaly detection logic is applied which
monitors for a $\Delta\Delta y_{i\_ema}(k)$ exceeding a $\pm 2\sigma_i$ threshold. Engine
1, shown in figure 11, does not contain a fault and all $\Delta\Delta y_{i\_ema}$
signals remain within the anomaly detection threshold for the
entire 200 flight profile. Engine 2, shown in figure 12,
contains a fan fault occurring on flight 100. Here one of the
$\Delta\Delta y_{i\_ema}$ signals (Ps30) exceeds the anomaly detection
threshold on flight 104, indicating that an anomaly is present.
Note that the $\Delta\Delta y_{i\_ema}$ signals do not immediately exceed the
detection threshold at the flight of fault occurrence, nor do
they persistently exceed the threshold into the future. This is
due to the combined effect of the exponential moving average
and the backwards difference calculations applied.
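A minimal Python sketch of the backwards difference in equation (7) and the threshold check follows. The function names, the zero-based flight indexing, and the synthetic EMA trace are assumptions made for illustration; the trace is built so that the shift develops over several flights, which reproduces both the detection latency and the non-persistent exceedance described above.

```python
from typing import List, Optional, Sequence


def delta_deltas(ema: Sequence[float], beta: int = 10) -> List[Optional[float]]:
    """Equation (7): ema(k) - ema(k - beta); None until beta earlier flights exist."""
    if beta < 1:
        raise ValueError("beta must be a positive number of flights")
    return [ema[k] - ema[k - beta] if k >= beta else None for k in range(len(ema))]


def first_anomaly(ema: Sequence[float], sigma: float,
                  beta: int = 10, n_sigma: float = 2.0) -> Optional[int]:
    """Index of the first flight whose |delta-delta| exceeds n_sigma * sigma."""
    for k, dd in enumerate(delta_deltas(ema, beta)):
        if dd is not None and abs(dd) > n_sigma * sigma:
            return k
    return None


# Hypothetical EMA trace: flat for 20 flights, then a downward shift of
# 3 sigma that builds up over five flights and stays in place afterwards.
sigma = 1.0
ema = [0.0] * 20 + [-0.6, -1.2, -1.8, -2.4, -3.0] + [-3.0] * 15
detected_on = first_anomaly(ema, sigma)
```

In this trace the shift starts at index 20 but the threshold is first crossed at index 23, and once both compared flights lie after the shift the delta-delta returns to zero.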
Figure 11. — $\Delta y_i/\sigma_i$, $\Delta y_{i\_ema}/\sigma_i$, and $\Delta\Delta y_{i\_ema}/\sigma_i$ values for engine 1 (no fault).


Figure 12. — $\Delta y_i/\sigma_i$, $\Delta y_{i\_ema}/\sigma_i$, and $\Delta\Delta y_{i\_ema}/\sigma_i$ values for engine 2 (fan fault occurring on flight 100). Each panel is plotted against flight cycle (0 to 200), with the detection thresholds marked.


Example Solution: Step 3 — Event Isolation 

The final step in the example solution is event isolation. If 
the diagnostic logic detects an anomaly, it classifies a root 
cause for the detected anomaly. Following the steps described 
in the previous section, an anomaly signature in the measure- 
ment delta-delta space due to the underlying fault event is 
obtained: 


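$$\Delta\Delta Y(k) = \begin{bmatrix} \Delta\Delta y_{1\_ema}(k) \\ \Delta\Delta y_{2\_ema}(k) \\ \vdots \\ \Delta\Delta y_{m\_ema}(k) \end{bmatrix} \tag{8}$$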


where m is the number of sensor measurements. 

Given this information a Kalman filter can be configured,
as described in references 2 and 20, to operate as a single fault
isolator conducting a snapshot type of analysis. The objective
of this analysis is to identify the single fault cause of the event
on the basis of the $\Delta\Delta Y(k)$ measurement delta-delta vector.
This requires the generation of a fault influence coefficient
matrix which relates engine faults to measurement delta-delta
changes in engine outputs. We will denote the ($m \times n$) fault
influence matrix as $H$, where $m$ = number of measurements,
and $n$ = number of single fault types. Assuming $\Delta x(k)$ is an
$n \times 1$ vector representing the magnitudes of the $n$ single fault
types under consideration, the interrelationship between faults
and measurement delta-delta changes can be written as:

$$\Delta\Delta Y(k) = H\,\Delta x(k) \tag{9}$$

For the gas path diagnostic benchmark problem $m = 8$
measurements, and $n = 18$ single fault types (see table 1). The
$H$ matrix can be generated by running the C-MAPSS steady-state
engine model to a fixed closed-loop operating condition
(specified by altitude, Mach, $\Delta$Tamb and corrected Nf) and
individually introducing each of the 18 fault types. The
elements of the matrix consist of the partial derivatives
relating the change in corrected measured engine outputs
(fault condition vs. nominal condition) to the magnitude of the
implanted fault. The $H^T$ matrix generated at the 35K ft, 0.78
Mach, 0 $\Delta$Tamb, 84 percent PCNfR operating condition for a
50 percent deteriorated engine is shown in table 3. Here, rotor
speed, pressure and fuel flow delta-delta shifts are shown in
percent units, and temperature delta-delta shifts are shown in
degrees Rankine. These delta-delta shifts are relative to a 1
percent fault in all cases except for temperature sensor faults,
which correspond to a 1 °R bias in the respective temperature
sensor.
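The one-fault-at-a-time construction of $H$ can be illustrated with a short Python sketch. The `toy_model` function below is a hypothetical linear stand-in for the C-MAPSS steady-state model (it is not part of the benchmark); its coefficients are borrowed from the Fan and HPC entries for Nc and P15 in table 3 only so that the recovered matrix is easy to check by eye.

```python
from typing import Callable, List, Sequence


def influence_matrix(model: Callable[[Sequence[float]], List[float]],
                     n_faults: int, magnitude: float = 1.0) -> List[List[float]]:
    """Build H (m rows x n columns) by implanting one fault at a time."""
    nominal = model([0.0] * n_faults)
    columns = []
    for j in range(n_faults):
        faults = [0.0] * n_faults
        faults[j] = magnitude                      # implant only fault j
        faulted = model(faults)
        columns.append([(f - n0) / magnitude for f, n0 in zip(faulted, nominal)])
    # columns holds one signature per fault (n x m); transpose to get H (m x n)
    return [list(row) for row in zip(*columns)]


def toy_model(faults: Sequence[float]) -> List[float]:
    """Hypothetical model: percent shifts in Nc and P15 for fan and HPC faults."""
    fan, hpc = faults
    nc_shift = -0.08 * fan + 0.09 * hpc
    p15_shift = -0.42 * fan + 0.02 * hpc
    return [nc_shift, p15_shift]


H = influence_matrix(toy_model, n_faults=2)
```

Because the stand-in model is linear, the finite difference recovers its coefficients exactly (up to rounding); with a nonlinear engine model the result would depend on the operating condition and on the implanted fault magnitude.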

Next, a single fault estimator can be constructed as: 

$$\Delta\hat{x}(k) = P_0 H^T \left( H P_0 H^T + R \right)^{-1} \Delta\Delta Y(k) \tag{10}$$
where $\Delta\hat{x}(k)$ is an $n \times 1$ vector of estimated fault magnitudes,
$P_0$ is the state covariance matrix, $R$ is the measurement noise
covariance, and $\Delta\Delta Y(k)$ is the vector of the observed
measurement delta-deltas at the time of anomaly detection on
flight $k$. The single fault isolation is obtained by processing
equation (10) iteratively to provide a snapshot analysis for each
of the $n$ single fault cases under consideration. Each iteration
of the above equation was made with a different $P_0$ matrix chosen
to accentuate one of the single fault cases while zeroing out all
others. Since this is a snapshot analysis, a covariance update
calculation is not required. Through this iterative process an
estimated fault magnitude, $\Delta\hat{x}_j(k)$, for each of the $n$ single
fault cases under consideration is produced. The index $j$, in
$\Delta\hat{x}_j(k)$, corresponds to the fault under consideration. The
corresponding estimated measurement delta-delta vector for
each of the $n$ single fault cases can be calculated as:

$$\Delta\Delta\hat{Y}_j(k) = H\,\Delta\hat{x}_j(k) \tag{11}$$

It is important to emphasize that in the above equation only
the $j^{\text{th}}$ element of the $\Delta\hat{x}_j(k)$ vector will be non-zero due to
the selection of $P_0$ during the iterative evaluation process. The
associated normalized measurement estimation error for each
of the $n$ fault cases is then calculated as:


$$e_y = \sum_{i=1}^{m} \left( \frac{\Delta\Delta\hat{y}_{i\_ema}(k) - \Delta\Delta y_{i\_ema}(k)}{\sigma_i} \right)^2 \tag{12}$$


where $\Delta\Delta\hat{y}_{i\_ema}(k)$ is the $i^{\text{th}}$ element of the vector $\Delta\Delta\hat{Y}_j(k)$
and $\sigma_i$ is the standard deviation of the $i^{\text{th}}$ measurement delta,
$\Delta y_i$. The single fault estimate, $\Delta\hat{x}_j(k)$, which produces the
minimum normalized measurement estimation error is inferred
to be the fault cause. Applying this isolation technique to the
measurement delta-delta vector of the detected anomaly in
figure 12 yielded the normalized measurement estimation
error results shown in figure 13 for the 18 candidate single
fault conditions. Here the estimated fan fault (fault ID 1)
produced the minimum normalized measurement estimation
error, and was correctly isolated as the cause of the anomaly.
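The isolation loop can be sketched in Python under a few stated assumptions. If $P_0$ is zero except for a single diagonal entry $p_0$ for fault $j$, and $R$ is taken to be diagonal with entries $\sigma_i^2$ (an assumption made here), then the Sherman-Morrison identity reduces equation (10) to the scalar form $\Delta\hat{x}_j = p_0\, h_j^T R^{-1} \Delta\Delta Y / (1 + p_0\, h_j^T R^{-1} h_j)$, where $h_j$ is column $j$ of $H$. The sketch uses four rows of table 3 as candidates, a sum of squared normalized residuals as the error measure, and hypothetical values for $p_0$ and $\sigma_i$; the function names are illustrative only.

```python
from typing import Dict, List, Sequence, Tuple

# Measurement order: Nc, P15, P24, Ps30, T24, T30, T48, Wf.
# Four rows of H^T copied from table 3 (one row per candidate fault).
H_T: Dict[str, List[float]] = {
    "Fan": [-0.08, -0.42, -0.36, -0.81, -0.17, -1.99, -3.88, -1.02],
    "HPC": [0.09, 0.02, 0.08, -0.16, 0.27, 5.23, 12.32, 0.60],
    "HPT": [-0.16, 0.04, 0.13, -1.09, 0.48, -3.51, 23.47, 1.09],
    "LPT": [0.15, -0.03, -0.15, 0.89, -0.45, 3.34, 0.12, 0.87],
}


def estimate_single_fault(h: Sequence[float], y: Sequence[float],
                          sigma: Sequence[float], p0: float) -> float:
    """Scalar form of equation (10) for one candidate (diagonal R assumed)."""
    num = sum(hi * yi / si ** 2 for hi, yi, si in zip(h, y, sigma))
    s = sum(hi * hi / si ** 2 for hi, si in zip(h, sigma))
    return p0 * num / (1.0 + p0 * s)


def estimation_error(h: Sequence[float], x_hat: float,
                     y: Sequence[float], sigma: Sequence[float]) -> float:
    """Equations (11) and (12): predicted signature, then normalized squared error."""
    return sum(((hi * x_hat - yi) / si) ** 2 for hi, yi, si in zip(h, y, sigma))


def isolate(y: Sequence[float], sigma: Sequence[float],
            p0: float = 100.0) -> Tuple[str, Dict[str, Tuple[float, float]]]:
    """Evaluate every candidate and return the one with the smallest error."""
    results = {}
    for name, h in H_T.items():
        x_hat = estimate_single_fault(h, y, sigma, p0)
        results[name] = (x_hat, estimation_error(h, x_hat, y, sigma))
    best = min(results, key=lambda name: results[name][1])
    return best, results


sigma = [1.0] * 8                       # hypothetical noise levels
y = [3.0 * h for h in H_T["Fan"]]       # synthetic, noise-free 3 percent fan fault
best, results = isolate(y, sigma)
```

A large $p_0$ makes the estimate approach a weighted least-squares fit of the candidate signature to the observed delta-delta vector, while a small $p_0$ shrinks the estimate toward zero.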



Figure 13. — Normalized measurement estimation error, $e_y$, for
18 single-fault candidates in the case of a fan fault.


TABLE 3. — TRANSPOSE OF FAULT INFLUENCE COEFFICIENT MATRIX, $H^T$
(rows: fault type; columns: measurement delta-delta, $\Delta\Delta y_i$)

Fault       Nc     P15     P24    Ps30     T24     T30     T48      Wf
Fan      -0.08   -0.42   -0.36   -0.81   -0.17   -1.99   -3.88   -1.02
LPC       0.00    0.00   -0.12   -0.03   -0.16   -0.02    0.32    0.01
HPC       0.09    0.02    0.08   -0.16    0.27    5.23   12.32    0.60
HPT      -0.16    0.04    0.13   -1.09    0.48   -3.51   23.47    1.09
LPT       0.15   -0.03   -0.15    0.89   -0.45    3.34    0.12    0.87
VSV       0.20    0.01    0.03   -0.03    0.08    1.82    3.51    0.17
VBV      -0.03    0.00   -0.16   -0.06   -0.53   -0.79   -0.64   -0.06
Nf       -0.34   -0.72   -1.06   -2.31   -1.93   -8.68  -16.52   -3.16
Nc           1       0       0       0       0       0       0       0
P15          0       1       0       0       0       0       0       0
P24          0       0       1       0       0       0       0       0
Ps30         0       0       0       1       0       0       0       0
T24          0       0       0       0       1       0       0       0
T30          0       0       0       0       0       1       0       0
T48          0       0       0       0       0       0       1       0
Wf           0       0       0       0       0       0       0       1
P2           0   -0.99   -0.99   -0.99       0       0       0   -0.99
T2       -0.07    0.08    0.12    0.28   -1.13   -1.98   -2.15    0.22

Example Solution — Metrics

The diagnostic performance of the given example solution
was evaluated against EFS-generated test cases consisting of
100 cases of each of the 18 fault types (1800 fault cases total),
plus an additional 1800 no-fault cases. The corresponding
metrics in the form of a detection decision matrix, classification
confusion matrix, and Kappa Coefficient were calculated
based on the assessed diagnostic condition produced by the
example solution, and the “ground truth” conditions produced
by the EFS. The detection decision matrix, along with
the True Positive Rate and False Positive Rate, is shown in
figure 14.
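For readers unfamiliar with the detection decision matrix, the two rates can be computed from its four counts as in the Python sketch below. The function name and the counts are hypothetical; they are not the values reported in figure 14.

```python
from typing import Tuple


def detection_rates(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) from detection counts.

    tp: fault cases detected        fn: fault cases missed
    fp: no-fault cases flagged      tn: no-fault cases correctly passed
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one fault case and one no-fault case")
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical counts for 1800 fault cases and 1800 no-fault cases.
tpr, fpr = detection_rates(tp=1200, fn=600, fp=18, tn=1782)
```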

The corresponding classification confusion matrix, percent
correct classification rate, and average detection latency (in
terms of the average number of flights) for the same 3600 test
cases are shown in figure 15. The Kappa Coefficient, calculated
by equation (4), was found to be 0.77. All metric calculations
are implemented within a MATLAB routine which automatically
calculates and archives the results given the diagnostic
assessments in a pre-defined format.
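As an illustration of the Kappa Coefficient, the Python sketch below assumes the standard Cohen's kappa form, in which the number of cases expected to be correct by chance is built from the row and column totals of a square confusion matrix $C_{pq}$. The function name and the small test matrices are hypothetical, and the sketch is not the MATLAB routine mentioned above.

```python
from typing import Sequence


def kappa_coefficient(confusion: Sequence[Sequence[float]]) -> float:
    """Cohen's kappa for a square confusion matrix (rows: truth, columns: diagnosis)."""
    n = len(confusion)
    if n == 0 or any(len(row) != n for row in confusion):
        raise ValueError("confusion matrix must be square and non-empty")
    n_total = sum(sum(row) for row in confusion)
    if n_total == 0:
        raise ValueError("confusion matrix contains no cases")
    n_correct = sum(confusion[p][p] for p in range(n))
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[q][p] for q in range(n)) for p in range(n)]
    # Cases expected to land on the diagonal if diagnoses were assigned by chance.
    n_chance = sum(r * c for r, c in zip(row_sums, col_sums)) / n_total
    if n_total == n_chance:
        raise ValueError("kappa is undefined when all cases fall in one class")
    return (n_correct - n_chance) / (n_total - n_chance)


perfect = kappa_coefficient([[5, 0], [0, 5]])      # every case classified correctly
chance = kappa_coefficient([[1, 1], [1, 1]])       # no better than chance
inverted = kappa_coefficient([[0, 5], [5, 0]])     # worse than chance
```

The three matrices correspond to the cases discussed for the metric: perfect classification, chance-level classification, and classification worse than chance.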


Future Steps 

This paper and the associated ASME Turbo Expo Controls,
Diagnostics and Instrumentation Committee tutorial are intended
to invite feedback from the EHM community regarding the
proposed format and content of the gas path diagnostic
benchmark process. The intent is to define a publicly available
benchmark problem with metrics that are of value to the entire
EHM community: developers, users, and evaluators. Input
received will be used to make adjustments to the benchmark
process. The next step will be to disseminate the problem and
invite solutions. Interested participants will be provided the
benchmark problem coded in MATLAB, an example solution,
the defined evaluation metrics, and identical blind test cases
free of charge. In exchange, as their contribution to the EHM
community, they will be asked to provide diagnostic assessments
for the provided blind test cases. These blind test case
diagnostic assessments will be evaluated against the defined
metrics by the TTCP. Participants will receive their results
plus the anonymous results of other participants. A follow-on
limited access workshop will be convened to share solution
results and lessons learned. The intent is not to formulate this
as a competition, but rather as a means for the engine health
management community to share diagnostic approaches.

Specific additional work is required prior to public release
of the benchmark problem. Representative distributions for
sensor noise, operating point variations and other data will be
incorporated based upon available operational engine data.
Assessments of metrics will be extended over a wider range of
diagnostic sets representative of in-service experience with
single and multiple faults. For additional information and
updates on the current status of this effort, individuals are
directed to visit the following website:

www.grc.nasa.gov/WWW/cdtb/software/ehmbenchmark.html 
Figure 14. — Example solution metrics — detection decision
matrix, true positive rate, and false positive rate.

Figure 15. — Example solution metrics — classification confusion matrix, percent correctly classified, average detection latency.

Conclusion

A publicly available gas path diagnostic benchmark prob- 
lem has been developed and is being offered to the EHM 
technical community to serve as a common platform for initial 
development and evaluation of candidate gas path diagnostic 
solutions. This public approach towards benchmarking diag- 
nostic systems is focused on providing a common basis of 
comparison of candidate solutions — an area of need as identi- 
fied in recent engine health management technology reviews. 
Members of the engine health management community are 
invited to provide their feedback on the proposed approach to 
ensure that it adequately addresses their areas of interest. An 
international program has been developed to lead this process 
including a follow-on workshop to share results and lessons 
learned. 